Optimizing the automated recognition of individual animals to support population monitoring

Abstract Reliable estimates of population size and demographic rates are central to assessing the status of threatened species. However, obtaining individual‐based demographic rates requires long‐term data, which is often costly and difficult to collect. Photographic data offer an inexpensive, noninvasive method for individual‐based monitoring of species with unique markings, and could therefore increase available demographic data for many species. However, selecting suitable images and identifying individuals from photographic catalogs is prohibitively time‐consuming. Automated identification software can significantly speed up this process. Nevertheless, automated methods for selecting suitable images are lacking, as are studies comparing the performance of the most prominent identification software packages. In this study, we develop a framework that automatically selects images suitable for individual identification, and compare the performance of three commonly used identification software packages: Hotspotter, I3S‐Pattern, and WildID. As a case study, we consider the African wild dog, Lycaon pictus, a species whose conservation is limited by a lack of cost‐effective large‐scale monitoring. To evaluate intraspecific variation in the performance of software packages, we compare identification accuracy between two populations (in Kenya and Zimbabwe) that have markedly different coat coloration patterns. The process of selecting suitable images was automated using convolutional neural networks that crop individuals from images, filter out unsuitable images, separate left and right flanks, and remove image backgrounds. Hotspotter had the highest image‐matching accuracy for both populations. However, the accuracy was significantly lower for the Kenyan population (62%), compared to the Zimbabwean population (88%). Our automated image preprocessing has immediate application for expanding monitoring based on image matching. However, the difference in accuracy between populations highlights that population‐specific detection rates are likely and may influence certainty in derived statistics. For species such as the African wild dog, where monitoring is both challenging and expensive, automated individual recognition could greatly expand and expedite conservation efforts.


| INTRODUCTION
Reliable estimates of population size and demographic rates are central to monitoring the status of threatened species. However, obtaining individual-based demographic parameters requires long-term data, gathered through intensive monitoring that is often costly and difficult to conduct (Caughlan, 2001; Horswill et al., 2018).
Identification of individuals from photographic records could provide an inexpensive alternative, and open up the possibility of using camera traps and citizen scientists to expand the spatial coverage of monitoring (Marnewick et al., 2014; Seber, 1965; Wearn & Glover-Kapfer, 2019). This method can be used for species where individuals can be identified from individual markings, including many threatened species (Durant et al., 2014; Pierce & Norman, 2016).
Photographic records have already been used to estimate demographic parameters in several endangered species. For example, long-term photographic data have been used to obtain survival and abundance estimates of tigers, Panthera tigris, and cheetahs, Acinonyx jubatus (Karanth & Nichols, 2011; Kelly et al., 1998), and tourist images have been used to estimate population sizes of whale sharks, Rhincodon typus (Davies et al., 2013). In addition, photographs can provide data on individual movement, ranging behavior, and social structure (Armstrong et al., 2019; Randić et al., 2012). Many species are photographed frequently as part of monitoring programs, and by members of the public, including tourists. Such image catalogs therefore represent a large and potentially underused data resource that could inform conservation action.
Nevertheless, visually identifying all individuals in large image databases is time-consuming. To partly automate this process, several software packages are available to match images based on an individual's unique body markings (e.g., APHIS and WildID; Bolger et al., 2012; Óscar et al., 2015). These image-matching software packages assist the user by ranking potential image matches using a similarity score. The algorithms underpinning the software packages find these potential matches by comparing images on either a pixel-by-pixel or feature basis. Pixel-based algorithms, such as APHIS, have been successfully applied to numerous species, including horseshoe whip snakes, Hemorrhois hippocrepis, and Balearic lizards, Podarcis lilfordi (Óscar et al., 2015; Rotger, 2019). However, they are susceptible to differences in camera angle, scale, and cropping (Matthé et al., 2017), and are therefore unsuitable for animals that cannot be caught and photographed using a standardized methodology. By contrast, feature-based software packages, such as WildID (Bolger, 2012), I3S-Pattern (Reijns, 2014), and Hotspotter (Crall et al., 2013), match images based on unique features including spots, stripes, blotches, or other marks. The algorithms that feature-based packages use vary, but all have a higher tolerance to differences in camera angle, scale, and lighting conditions than pixel-based algorithms.
The feature-based packages have been tested on a range of taxa (Table 1), and the reported proportion of true matches that the software detects, that is, accuracy rate, varies markedly, ranging between 36% and 100%. This variation can be attributed to differences in species markings, image quality, size of database, how many potential matches were inspected per image, and the image-matching software used (Crall et al., 2013; Matthé et al., 2017; Nipko et al., 2020). Studies directly comparing the accuracy of different feature-based packages are considerably more limited, even though the most accurate software differs between species. For example, studies on jaguars, Panthera onca, ocelots, Leopardus pardalis, and Saimaa ringed seals, Phoca hispida saimensis, found that Hotspotter outperformed WildID (Chehrsimin et al., 2018; Nipko et al., 2020), while studies on amphibian species found that WildID outperformed I3S-Pattern (Matthé et al., 2017; Nipko et al., 2020) and Hotspotter (Morrison et al., 2016). The only study that directly compared all three software packages found that Hotspotter was superior to I3S-Pattern and WildID for identifying individual green toads, Bufotes viridis (Burgstaller et al., 2021). To date, studies comparing image-matching accuracy across all three software packages for a mammal species are lacking.
Although feature-based algorithms are better at matching images from different viewpoints than pixel-based algorithms, researchers are still required to select images that are suitable for identification, in that the distinctive marks must face the camera and must be clearly visible. Furthermore, when these suitable images are selected, the user has to crop the region of interest from the image.
For photos that only contain a single animal, this process can be completed in less than 10 s. However, photographs of group-living animals are likely to contain multiple animals, some of which might be suitable for identification, while others might not. Consequently, manually selecting animals whose marks are clearly visible and then cropping these can take minutes per photo if photos contain a large number of animals. This laborious process is potentially preventing the application of image-matching software to large image catalogs (Miguel et al., 2019). Parham et al. (2018) highlighted the potential that machine learning methods have for automating this process, although it has only been automated for a few species.
African wild dogs, Lycaon pictus (hereafter "wild dogs"), have unique coat markings, which vary between individuals (Figure 1; Maddock & Mills, 1994). Wild dogs are classified as globally endangered, and a lack of cost-effective large-scale monitoring has been highlighted as a major limitation in developing effective conservation strategies (Woodroffe & Sillero-Zubiri, 2020). Consequently, there is a pressing need to devise new approaches for monitoring wild dogs. Demographic processes of African wild dogs are typically studied by monitoring a subset of individuals fitted with tracking collars (Jenkins et al., 2015; Rabaiotti & Woodroffe, 2019; Woodroffe et al., 2019). Such collar-based monitoring is labor-intensive and expensive, so upscaling is difficult. However, many wild dog packs have already been systematically photographed as part of monitoring programs, and many are also regularly photographed by tourists. Therefore, photographic identification of wild dogs potentially offers a noninvasive, cheaper approach for monitoring, and could reduce uncertainties in demographic rates and expand the spatial representation of monitoring (Maddock & Mills, 1994; Marnewick et al., 2014).

[Table 1: contents not recoverable from the source; surviving column headers are "Number of inspected ranks per image" and "Reference".]
[…] shown to effectively match images of the same individual, reaching accuracy rates of up to 97% (Burgstaller et al., 2021; Matthé et al., 2017). Therefore, it is likely that feature-based image-matching algorithms will effectively identify individual wild dogs from image catalogs. However, variation in the degree of contrast in the color patterns among populations could affect the image-matching accuracy, and the best performing software package could therefore also vary between populations.
In this study, we develop a method to automatically isolate and crop images from catalogs that are suitable for automated image matching. We then use these images to compare the efficacy of three feature-based software packages with different underlying image-matching algorithms (I3S-Pattern, Hotspotter, and WildID; Bolger, 2012; Crall et al., 2013; Reijns, 2014). Finally, we compare whether there is a difference in the accuracy of each software package between two populations with differing coat patterns.

| Image datasets
To examine whether the performance of feature-based image-

| Preprocessing steps
To automate the selection of suitable images for image matching, we developed a five-step image preprocessing method (Figure 2).

| Detecting and cropping individuals from images
The aim of the first step in the image preprocessing method was to automatically detect and crop wild dog individuals from the images.
To do this, we used the Microsoft AI for Earth MegaDetector

| Aspect-ratio filtering
The aim of the second step in the image preprocessing method was to filter out images that were unsuitable for identification due to the individual's body rotation in the image, or because of occlusion of the animal's flank by another object or individual. This down-selection of images ensures that all images depict individuals from roughly the same viewpoint, to maximize the confidence with which images can be matched. We considered crops suitable for image matching if approximately ≥80% of the individual's flank was visible, and the angle between the image axis and animal's flank was less than approximately 30°, that is, the flank was facing the camera. Crops where the angle between the image axis and the animal's flank was more than 30°, and crops where a part of the flank was obscured by another animal, were expected to be narrower than crops suitable for image matching and therefore demonstrate a relatively low aspect ratio. By contrast, crops where the flank was concealed because the individual was lying down, or obscured by vegetation, were expected to be considerably wider and demonstrate a relatively high aspect ratio. These criteria were visually assessed for the crops that the MegaDetector produced in the previous step. We then calculated the range of aspect ratios for suitable crops, that is, where an unobscured flank was facing the camera, using the "jpeg" package (Urbanek, 2021) in program R (version 4.0.4, R Core Team, 2020). Images with an aspect ratio outside of this range were removed from the dataset.
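The aspect-ratio filter described above can be sketched in a few lines. The following Python example is a minimal illustration only, not the code used in this study (which read image dimensions with the "jpeg" package in R). It assumes that aspect ratio is defined as width divided by height, and all crop names and pixel dimensions are hypothetical.

```python
from typing import Iterable, List, Tuple

# (crop name, width in pixels, height in pixels); all values are hypothetical
Crop = Tuple[str, int, int]


def aspect_ratio(width: int, height: int) -> float:
    """Width divided by height: narrow crops give low values, wide crops high values."""
    if width <= 0 or height <= 0:
        raise ValueError("crop dimensions must be positive")
    return width / height


def suitable_range(reference_crops: Iterable[Crop]) -> Tuple[float, float]:
    """Range of aspect ratios spanned by crops that were visually judged suitable."""
    ratios = [aspect_ratio(w, h) for _, w, h in reference_crops]
    if not ratios:
        raise ValueError("need at least one reference crop")
    return min(ratios), max(ratios)


def filter_by_aspect_ratio(crops: Iterable[Crop], bounds: Tuple[float, float]) -> List[Crop]:
    """Keep only crops whose aspect ratio lies inside the suitable range."""
    low, high = bounds
    return [c for c in crops if low <= aspect_ratio(c[1], c[2]) <= high]


# Crops visually assessed as showing an unobscured flank facing the camera
reference = [("ref_a.jpg", 600, 400), ("ref_b.jpg", 500, 400), ("ref_c.jpg", 700, 400)]
bounds = suitable_range(reference)

candidates = [
    ("turned.jpg", 300, 400),  # narrow crop: body rotated away or partly hidden
    ("flank.jpg", 640, 400),   # within the suitable range
    ("lying.jpg", 900, 300),   # wide crop: animal lying down
]
kept = filter_by_aspect_ratio(candidates, bounds)
```

In this toy example only "flank.jpg" is retained. Taking the minimum and maximum of the reference ratios is one possible reading of "range"; a percentile-based range would be more robust to outliers.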

| Selecting standing individuals
Not all sitting or lying individuals could be filtered out solely using image aspect ratios. Therefore, the aim of the third step in the image preprocessing method was to filter out the remaining crops that were unsuitable for identification because the individual's body position, that is, sitting or lying, obscured the full coat pattern. To do this, we trained a CNN to classify crops as either a standing wild dog or a sitting wild dog. To obtain data to train this image classifier, we used the full image catalogs from both sites (n = 11,205). The crops produced by steps 1 and 2 of the preprocessing (n = 21,745) were then manually classified as containing either a standing wild dog (n = 13,500) or a sitting wild dog (n = 6512). We removed all crops depicting anything other than wild dogs (e.g., birds, rocks, or logs), or wild dogs where it could not be confirmed whether they were standing or sitting, because only a small part of the animal was visible (n = 1733). We trained a CNN using the remaining 20,012 preprocessed crops, to classify these as containing either a standing wild dog or not. The CNN was made using TensorFlow (Abadi et al., 2016)

| Image background removal
Lastly, we removed the image backgrounds of suitable images using the "rembg" package in Python (Gatis, 2020). We removed image backgrounds to remove the risk of the background confounding image-matching results, while eliminating the need to manually select an individual's flank.

| Image-matching software packages
We compared the performance of three feature-based image-matching software packages that differ in the underlying algorithms used to match individuals: I3S-Pattern (Reijns, 2014), WildID (Bolger, 2012), and Hotspotter (Crall et al., 2013). All three assist the user by listing potential matches for each image, ranked by a similarity score. The user then confirms which of these potential matches are true matches.
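The ranking step common to all three packages, and the way accuracy depends on how many ranks the user inspects, can be illustrated with a short Python sketch. This is a simplified illustration rather than the internals of any of the packages. It assumes that a higher score means a more similar image (the direction of the score may differ between packages), and the image names and scores are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple


def rank_candidates(scores: Dict[str, float]) -> List[str]:
    """Order candidate image names from most to least similar to the query image."""
    return sorted(scores, key=lambda name: scores[name], reverse=True)


def match_in_top_k(scores: Dict[str, float], true_matches: Set[str], k: int = 10) -> bool:
    """True if at least one confirmed match appears among the first k ranked candidates."""
    return any(name in true_matches for name in rank_candidates(scores)[:k])


def top_k_accuracy(queries: Sequence[Tuple[Dict[str, float], Set[str]]], k: int = 10) -> float:
    """Proportion of query images with a true match among the first k ranks."""
    if not queries:
        raise ValueError("need at least one query image")
    hits = sum(match_in_top_k(scores, truth, k) for scores, truth in queries)
    return hits / len(queries)


# Two hypothetical query images, each scored against three candidate images
queries = [
    ({"a.jpg": 0.9, "b.jpg": 0.4, "c.jpg": 0.1}, {"a.jpg"}),  # true match ranked first
    ({"a.jpg": 0.2, "b.jpg": 0.8, "c.jpg": 0.5}, {"a.jpg"}),  # true match ranked third
]
accuracy_rank_1 = top_k_accuracy(queries, k=1)
accuracy_rank_3 = top_k_accuracy(queries, k=3)
```

Here accuracy rises from 0.5 to 1.0 when three ranks are inspected instead of one, which is why the number of inspected ranks needs to be reported alongside any accuracy rate.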

| I3S-Pattern
I3S-Pattern uses the Speeded Up Robust Features (SURF) algorithm (Bay et al., 2008; Reijns, 2014) that selects key points and compares each image pair in a dataset based on the size and position of these key points. The software requires the user to select three reference points per image, as well as the outline of the animal. As reference points, we used the base of the tail, the withers (i.e., the ridge between the shoulder blades), and the base of the neck (Figure S1).

| Performance of the image-matching software
To test which image-matching software most accurately matched crops of the same individual, we created two separate datasets for the Kenyan and Zimbabwean populations. To select suitable crops, we used the four-step image preprocessing method described above. We also visually inspected discarded crops to avoid missing suitable crops. We then manually identified individuals from the dataset of right flank crops, to provide a standard against which automated identifications could be compared, and randomly se-
Previous studies have shown that the image-matching performance of different software packages is affected by database size (Matthé et al., 2017). Therefore, to compare software performance on wild dogs from Kenyan and Zimbabwean populations, we randomly selected a subset of the Kenyan individuals to equal the number of identified individuals in the Zimbabwean dataset (n = 89). We then used the best performing software package identified in the previous step of the analysis to rerun the image-matching analysis for both datasets. Differences in software performance between the two populations were then assessed using a mixed effects logistic regression with a binomial error distribution and a logit link function. The response variable in the model was whether or not a match was detected in the first 10 ranked images, and study site (Kenya or Zimbabwe) was the explanatory variable. To correct for possible differences in image quality, two proxies for image quality were included in the model.
First, we included image size (total number of pixels of the crop) as a continuous predictor. Second, all images were visually scored on a scale of 1-3, based on how well their distinct marks could be recognized. This approach followed Nipko et al. (2020), where score 1 was given to images that were out of focus, of a moving animal, or badly lit, score 2 was given to images of intermediate quality, and score 3 was given to images where all features were clearly visible (e.g., see Figure S3). Score was included as a fixed effect and individual identity was included as a random effect.
Furthermore, a Wilcoxon rank sum test was performed to test for differences between the quality score of crops from Kenya and Zimbabwe. The model was fit using the "lme4" package (v 1.1-27.1, Bates et al., 2015) in R (version 4.0.4, R Core Team, 2020).
The probability of accurate image matching occurring within the first 10 ranked images was significantly higher for wild dogs from Zimbabwe than for wild dogs from Kenya (OR = 9.64, 95% CI 3.65-15.63, Figure 4). The proportion of matched individuals identified in this analysis was not significantly associated with image size (χ²₁ = 0.16, p = .69) or image quality (OR for quality score 2 versus quality score 1 = 0.89, 95% CI −2.26 to 4.04; OR for quality score 3 versus quality score 1 = 1.82, 95% CI −2.20 to 5.83). In addition, the image quality score did not differ between the populations (W = 15,008, p = .33).
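To make the odds ratio easier to interpret, the following Python sketch computes an unadjusted odds ratio with a Wald confidence interval from a two-by-two table of matched and unmatched images. It is a simplified illustration, not the mixed effects model fitted in this study: it ignores the random effect of individual identity and the image quality covariates, so it will not reproduce the model-based estimates. The counts are hypothetical, chosen only to mirror the reported accuracies of 88% and 62%.

```python
import math
from typing import Tuple


def odds_ratio_wald(
    hits_a: int, misses_a: int, hits_b: int, misses_b: int, z: float = 1.96
) -> Tuple[float, float, float]:
    """Unadjusted odds ratio (group A versus group B) with a Wald interval.

    The interval is built on the log-odds scale and then exponentiated,
    so both limits are always positive.
    """
    if min(hits_a, misses_a, hits_b, misses_b) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (hits_a * misses_b) / (misses_a * hits_b)
    se_log = math.sqrt(1 / hits_a + 1 / misses_a + 1 / hits_b + 1 / misses_b)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se_log), math.exp(log_or + z * se_log)


# Hypothetical counts per 100 query images (not the study's data):
# site A: 88 matched within the first 10 ranks, 12 not; site B: 62 matched, 38 not
estimate, lower, upper = odds_ratio_wald(88, 12, 62, 38)
```

An odds ratio above 1 with a lower limit above 1 indicates higher odds of a successful match at site A than at site B under these assumed counts.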

| DISCUSSION
This study presents a novel framework for automating the individual recognition of species with distinct marks. The framework includes an automated preprocessing method for identifying images suitable for image matching, and then using image-matching software for individual recognition. The automated preprocessing method consists of five steps that (1) crop all images containing animals from a large database, (2) filter out a portion of the unsuitable images based on image aspect ratio, (3) use CNNs to select images of standing individuals (accuracy of 90%), (4) separate images into left and right flanks (accuracy of 95%), and (5) remove image backgrounds. As a case study, we applied the described methods to an image catalog of African wild dogs and found that Hotspotter (Crall et al., 2013) was the most efficient software package for matching images. Image-matching performance was also significantly improved by using the full image of an individual from which the background was removed, as opposed to just the cropped flank.
Finally, we found that image-matching performance differed between populations of wild dogs with different coat coloration patterns. This work showed that image-matching software could become a powerful method for monitoring populations of African wild dogs. However, caution is needed as detection rates are likely to vary between, and even within, populations. This could affect the certainty of derived population-specific demographic parameters, such that careful consideration is needed to account for individual heterogeneity in detection when large variation in coat coloration occurs within a population.
The automated preprocessing method presented in this study could eliminate the need to manually select suitable images for image matching and crop individuals from original photographs. This method thus enables processing of large image catalogs where selection using visual inspection would be extremely time-consuming.
We found that the method does discard a small number of suitable images, and therefore in situations where it is important to include all suitable images, the preprocessing method outlined here could also be used as a presorting approach. The user could then visually review images that were classified as not suitable, to prevent usable images from being discarded.
The described method of preprocessing is particularly useful for wild dogs, since an individual's posture varies substantially between images. Images taken by tourists provide an opportunity to bolster and spatially extend image catalogs. However, these images are also likely to contain many images unsuitable for identification, as they are not taken for the purpose of identification. Accordingly, filtering unsuitable images from these datasets using an automated approach could be especially timesaving. The described preprocessing method is therefore highly suitable for species targeted by wildlife-watching excursions that have distinctive marks and where individual […]
TABLE 2 The accuracy and 95% confidence intervals of the best performing convolutional neural networks aiming to classify images of African wild dogs into (1) those depicting an individual standing up and not standing up and (2) those depicting left or right flanks.
[…] et al., 2021), as well as studies comparing Hotspotter and WildID (Burgstaller et al., 2021; Chehrsimin et al., 2018; Nipko et al., 2020).

Nevertheless, this result is not ubiquitous. WildID was superior to Hotspotter at matching images for a blotched amphibian species, the Wyoming toad, Anaxyrus baxteri (Morrison et al., 2016). This indicates that the identification performance of different software packages is dependent on species, even when two species' patterns show similarities. Consequently, we recommend that all three software packages are tested on new species before deciding on which one to use.
Using crops of full individuals from which the background was […] on Saimaa ringed seals, Pusa hispida, and Thornicroft's giraffes, G. camelopardalis thornicrofti, which found evidence that using a full individual from which the background is removed could result in a higher accuracy (Chehrsimin et al., 2018; Halloran et al., 2015).
However, neither of these previous studies statistically tested whether background removal increased identification accuracy. Our study therefore provides the first statistical evidence that background removal can increase the performance of image-matching software. This also indicates that the common usage of Hotspotter, in which a rectangular region of interest is manually cropped (e.g., Dunbar et al., 2021; Nipko et al., 2020), could be improved by removing the image background.
Hotspotter was significantly better at matching images from Zimbabwean wild dogs, compared to Kenyan individuals. The higher image-matching accuracy found for the Zimbabwean population is likely to reflect the regional difference in wild dog coat coloration patterns. The Kenyan population has darker, more uniform coats, consisting of large black patches, often with few white or tan areas (Daniels et al., 2022; McIntosh et al., 2016). By contrast, the proportion of tan fur is ~1.5 times higher, and the proportion of white fur is almost seven times higher for the Zimbabwean population (Figure 1; Daniels et al., 2022). Therefore, the higher contrast within the patterns of the Zimbabwean wild dogs could make it easier for the software to match images of these individuals. The identified relationship between image-matching performance and software package remained unaltered when image quality and image size were included in analyses, and there was no significant difference in image quality scores between the Zimbabwean and Kenyan populations. The image quality score approach was modeled after Nipko et al. (2020), who found that it significantly affected the probability of matching ocelot and jaguar individuals. As a result, we are confident that the variation in identification performance between populations reflects the differences in coat coloration patterns between wild dogs from Zimbabwe and Kenya.
Interpopulation variation in image-matching performance indicates that detection probabilities derived from using this approach will not be directly comparable between populations. Since the probability of finding an accurate image match depends on individual coat pattern, this finding highlights that individual heterogeneity in detection may also occur if large variation in coat coloration occurs within a population. Capture-mark-recapture techniques assume that individuals experience equal detection probability across a population (White & Burnham, 2009). Therefore, individual coat pattern may also need accounting for when deriving survival estimates using such analysis. This also applies to other species whose coat pattern varies regionally, such as Asian golden cats, Catopuma temminckii, and ocelots (Allen et al., 2011; Khan et al., 2017). Furthermore, the coat patterns of other wild dog populations can differ considerably from the two populations included in this study (Daniels et al., 2022; McIntosh et al., 2016). Consequently, we advocate that estimating a population-specific image-matching accuracy score becomes an essential prerequisite step for applying these techniques in different locations.
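The consequence of unequal detection can be illustrated with a deliberately simple calculation. The Python sketch below uses the two-session Lincoln-Petersen abundance estimator as a stand-in for capture-mark-recapture analysis in general. This estimator was not used in this study, and treating the two reported matching accuracies as detection probabilities of two equally sized groups within one population is an assumption made purely for illustration.

```python
from typing import Sequence


def expected_lincoln_petersen(class_sizes: Sequence[float], detection_probs: Sequence[float]) -> float:
    """Large-sample approximation of the two-session Lincoln-Petersen estimate.

    Each class of individuals has its own detection probability, applied
    independently in both sessions. The estimator is n1 * n2 / m, where n1 and
    n2 are the expected numbers detected per session and m is the expected
    number detected in both sessions.
    """
    if len(class_sizes) != len(detection_probs):
        raise ValueError("one detection probability is needed per class")
    detected = sum(n * p for n, p in zip(class_sizes, detection_probs))
    detected_twice = sum(n * p * p for n, p in zip(class_sizes, detection_probs))
    if detected_twice <= 0:
        raise ValueError("no expected recaptures; estimate is undefined")
    return detected * detected / detected_twice


true_size = 100
# Equal detection for everyone: the estimator recovers the true size
homogeneous = expected_lincoln_petersen([50, 50], [0.75, 0.75])
# Hypothetical mix of easily matched (0.88) and poorly matched (0.62) coat patterns
heterogeneous = expected_lincoln_petersen([50, 50], [0.88, 0.62])
```

Under these assumptions the heterogeneous case yields an estimate of about 97 for a true population of 100, showing the direction of the bias that arises when individual heterogeneity in detection is ignored.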
Automatically preprocessing wild dog image datasets and using […]
Data obtained in this way could provide cost-effective large-scale monitoring for endangered species, therefore supporting the implementation of effective conservation.

DATA AVAILABILITY STATEMENT
The code used for data analysis and image preprocessing, and all results, are available on Zenodo: https://zenodo.org/record/7712853#.ZAnrSy-l0Vw. The full image datasets are available on request.